Introduction

Big data. It’s the buzzword. We recognize it as important to driving, and creating trust in, our decision-making process. We generate data quickly and in significant amounts, but it is complicated and quickly becomes overwhelming. In 2017 the Harvard Business Review reported that:

  • “…less than half of an organization’s structured data is actively used in decision making…
  • 80% of an analyst’s time is spent discovering and preparing data, and …
  • An [organization’s] technology often isn’t up to the demands put on it …”

As a team that uses “all readily available and credible data” throughout the state, we continuously have to solve the challenges of working with big datasets from disparate data sources. Through the Department’s first continuous improvement efforts, the Integrated Report (IR) Programming Team re-evaluated and reimagined how we use technology to visualize and overcome the challenges of big data. We set out to become more efficient, agile, and credible in our decision-making processes when reporting on the quality of the state’s surface waters. This summary discusses the challenges and successes from the IR team’s continuous improvement efforts over the last three years.


Challenges

Large scope

  • 855 assessment units
  • 622 geographic use assignments
  • >1000 unique criteria
    • Site-specific
    • Temporally specific

Harmonizing datasets

Multiple data sources
1. Internal data
2. Partner agencies
3. Stakeholders

Interpretation and analysis issues:
1. Field & lab methods
2. Parameters
3. Site types
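
One way to picture the harmonization step is a set of translation tables that map each source's parameter names and units onto a single vocabulary. The sketch below is only an illustration under assumptions: the table contents, field names, and the `harmonize` helper are hypothetical and are not the team's actual tables or code.

```python
# Hypothetical translation table: source-specific parameter names -> one harmonized name.
PARAMETER_MAP = {
    "DO": "Dissolved oxygen",
    "Dissolved oxygen (DO)": "Dissolved oxygen",
    "Temp, water": "Temperature",
    "Water temperature": "Temperature",
}

# Hypothetical target unit for each harmonized parameter.
TARGET_UNITS = {"Dissolved oxygen": "mg/L", "Temperature": "deg C"}

# Multipliers that convert (reported unit, target unit) pairs.
UNIT_FACTORS = {
    ("mg/L", "mg/L"): 1.0,
    ("ug/L", "mg/L"): 0.001,
    ("deg C", "deg C"): 1.0,
}


def harmonize(record):
    """Return a harmonized copy of the record, or None if it cannot be mapped."""
    name = PARAMETER_MAP.get(record["parameter"])
    if name is None:
        return None  # unknown parameter name: flag for manual review
    target = TARGET_UNITS[name]
    factor = UNIT_FACTORS.get((record["unit"], target))
    if factor is None:
        return None  # unknown unit conversion: flag for manual review
    return {**record, "parameter": name, "value": record["value"] * factor, "unit": target}


# Made-up records standing in for internal, partner, and stakeholder data.
records = [
    {"source": "internal", "parameter": "DO", "value": 7.2, "unit": "mg/L"},
    {"source": "partner", "parameter": "Dissolved oxygen (DO)", "value": 6800.0, "unit": "ug/L"},
    {"source": "stakeholder", "parameter": "Turbidity", "value": 4.0, "unit": "NTU"},
]

harmonized = [h for h in map(harmonize, records) if h is not None]
unmapped = [r for r in records if harmonize(r) is None]
```

Records that cannot be mapped are set aside rather than guessed at, so gaps in the translation tables stay visible.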

Batching assessments

Previous approaches:

  • Site by site
  • Parameter by parameter
  • Criterion by criterion


Digitizing criteria & uses

Water quality criteria

Water quality criteria are stored in mixed-format text tables online. These criteria must be digitized and reformatted into a criterion database, including essential temporal and spatial metadata, so that criteria and data can be linked automatically.

Native web format of [Utah water quality criteria](https://rules.utah.gov/publicat/code/r317/r317-002.htm).

Flattened database format of Utah water quality criteria.
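
To illustrate what "flattening" can mean here, the sketch below converts a wide table (one row per parameter, one column per use) into one row per parameter-use pair. It is a minimal example under assumptions: the column names, use labels, and numeric values are placeholders, not actual Utah criteria, and `flatten` is a hypothetical helper.

```python
# Hypothetical wide table as it might be transcribed from a web page:
# one row per parameter, one column per beneficial use. Values are placeholders.
wide_table = [
    {"parameter": "parameter_1", "unit": "mg/L", "use_A": "10", "use_B": "", "use_C": "4"},
    {"parameter": "parameter_2", "unit": "ug/L", "use_A": "", "use_B": "2.5", "use_C": ""},
]


def flatten(wide_rows, id_fields=("parameter", "unit")):
    """Turn a wide criteria table into one row per (parameter, use) pair."""
    long_rows = []
    for row in wide_rows:
        for column, cell in row.items():
            if column in id_fields:
                continue
            cell = cell.strip()
            if not cell:
                continue  # blank cell: no criterion for this use
            long_rows.append(
                {
                    "parameter": row["parameter"],
                    "unit": row["unit"],
                    "use": column,
                    "criterion": float(cell),
                }
            )
    return long_rows


criteria_long = flatten(wide_table)
for entry in criteria_long:
    print(entry)
```

In the long format, each criterion is a single record that can be joined to sample data by parameter and use. Temporal and spatial metadata would be added as further columns.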


Temporal & spatial components

Beneficial use descriptions and site-specific criteria were written as geographical text descriptions. They needed to be digitized into polygons so that uses and criteria can be assigned automatically to water quality monitoring locations.

Beneficial uses and site specific criteria polygons for Utah.
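
The automatic assignment described above amounts to a point-in-polygon test for each monitoring location. The sketch below uses a plain ray-casting test so that it runs without a GIS library. The polygons, use labels, and site coordinates are invented for illustration, and a production workflow would more likely rely on a dedicated spatial library.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) falls inside the polygon's vertex ring."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical use polygons (simple squares in arbitrary coordinates).
use_polygons = [
    {"uses": ["use_A", "use_B"], "ring": [(0, 0), (4, 0), (4, 4), (0, 4)]},
    {"uses": ["use_C"], "ring": [(3, 3), (6, 3), (6, 6), (3, 6)]},
]


def assign_uses(x, y, polygons=use_polygons):
    """Collect the uses of every polygon that contains the monitoring location."""
    uses = set()
    for poly in polygons:
        if point_in_polygon(x, y, poly["ring"]):
            uses.update(poly["uses"])
    return sorted(uses)


# Made-up monitoring locations.
sites = {"site_1": (1.0, 1.0), "site_2": (3.5, 3.5), "site_3": (10.0, 10.0)}
assignments = {name: assign_uses(x, y) for name, (x, y) in sites.items()}
```

A site that falls in overlapping polygons inherits the uses of both, and a site outside every polygon gets an empty list that can be flagged for review.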

Formula derived criteria

Several criteria are sample-specific: they depend on other factors at the time of sampling, such as hardness, pH, or temperature. These formulas needed to be stored along with the criteria and automatically calculated for each applicable sample.

Hardness-corrected criteria.
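
As a sketch of how a formula can be stored with a criterion and evaluated per sample, the example below assumes the log-linear form commonly used for hardness-dependent metals criteria, criterion = exp(slope × ln(hardness) + intercept) × conversion factor. The coefficients, parameter name, and helper functions are hypothetical placeholders, not values from the Utah standards.

```python
import math

# Hypothetical criterion record: the formula's coefficients are stored with the
# criterion so the limit can be recalculated for every sample.
CRITERIA = {
    "example_metal": {
        "slope": 0.85,
        "intercept": -1.5,
        "conversion_factor": 0.9,
        "unit": "ug/L",
    },
}


def hardness_criterion(hardness_mg_l, slope, intercept, conversion_factor=1.0):
    """Evaluate exp(slope * ln(hardness) + intercept) * conversion_factor."""
    if hardness_mg_l <= 0:
        raise ValueError("hardness must be positive")
    return math.exp(slope * math.log(hardness_mg_l) + intercept) * conversion_factor


def evaluate_sample(sample, criteria=CRITERIA):
    """Compute the sample-specific criterion and whether the sample exceeds it."""
    c = criteria[sample["parameter"]]
    limit = hardness_criterion(
        sample["hardness_mg_l"], c["slope"], c["intercept"], c["conversion_factor"]
    )
    return {"criterion": limit, "unit": c["unit"], "exceeds": sample["value"] > limit}


# Made-up sample: the same concentration is judged against a hardness-dependent limit.
sample = {"parameter": "example_metal", "value": 25.0, "hardness_mg_l": 100.0}
result = evaluate_sample(sample)
print(result)
```

Because the limit is recomputed from each sample's own hardness, the same concentration can pass at one site and exceed the criterion at another.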


Other issues

  • Documentation
  • Communicating results
  • Treadmill timeline
  • Technical deficit

Opportunities

  1. Data in one usable format, connected to criteria and spatial and temporal metadata
  2. Significant resources focused on a uniform set of issues
  3. Build credibility and trust
  4. Multipurpose tools
  5. Understanding higher level information